11 research outputs found

    Detecting LLM-Generated Text in Computing Education: A Comparative Study for ChatGPT Cases

    Full text link
    Due to the recent improvements and wide availability of Large Language Models (LLMs), they have posed a serious threat to academic integrity in education. Modern LLM-generated text detectors attempt to combat the problem by offering educators with services to assess whether some text is LLM-generated. In this work, we have collected 124 submissions from computer science students before the creation of ChatGPT. We then generated 40 ChatGPT submissions. We used this data to evaluate eight publicly-available LLM-generated text detectors through the measures of accuracy, false positives, and resilience. The purpose of this work is to inform the community of what LLM-generated text detectors work and which do not, but also to provide insights for educators to better maintain academic integrity in their courses. Our results find that CopyLeaks is the most accurate LLM-generated text detector, GPTKit is the best LLM-generated text detector to reduce false positives, and GLTR is the most resilient LLM-generated text detector. We also express concerns over 52 false positives (of 114 human written submissions) generated by GPTZero. Finally, we note that all LLM-generated text detectors are less accurate with code, other languages (aside from English), and after the use of paraphrasing tools (like QuillBot). Modern detectors are still in need of improvements so that they can offer a full-proof solution to help maintain academic integrity. Further, their usability can be improved by facilitating a smooth API integration, providing clear documentation of their features and the understandability of their model(s), and supporting more commonly used languages.Comment: 18 pages total (16 pages, 2 reference pages). In submissio

    Student Usage of Q&A Forums: Signs of Discomfort?

    Full text link
    Q&A forums are widely used in large classes to provide scalable support. In addition to offering students a space to ask questions, these forums aim to create a community and promote engagement. Prior literature suggests that the way students participate in Q&A forums varies and that most students do not actively post questions or engage in discussions. Students may display different participation behaviours depending on their comfort levels in the class. This paper investigates students' use of a Q&A forum in a CS1 course. We also analyze student opinions about the forum to explain the observed behaviour, focusing on students' lack of visible participation (lurking, anonymity, private posting). We analyzed forum data collected in a CS1 course across two consecutive years and invited students to complete a survey about perspectives on their forum usage. Despite a small cohort of highly engaged students, we confirmed that most students do not actively read or post on the forum. We discuss students' reasons for the low level of engagement and barriers to participating visibly. Common reasons include fearing a lack of knowledge and repercussions from being visible to the student community.Comment: To be published at ITiCSE 202

    Opportunities for Adaptive Experiments to Enable Continuous Improvement that Trades-off Instructor and Researcher Incentives

    Full text link
    Randomized experimental comparisons of alternative pedagogical strategies could provide useful empirical evidence in instructors' decision-making. However, traditional experiments do not have a clear and simple pathway to using data rapidly to try to increase the chances that students in an experiment get the best conditions. Drawing inspiration from the use of machine learning and experimentation in product development at leading technology companies, we explore how adaptive experimentation might help in continuous course improvement. In adaptive experiments, as different arms/conditions are deployed to students, data is analyzed and used to change the experience for future students. This can be done using machine learning algorithms to identify which actions are more promising for improving student experience or outcomes. This algorithm can then dynamically deploy the most effective conditions to future students, resulting in better support for students' needs. We illustrate the approach with a case study providing a side-by-side comparison of traditional and adaptive experimentation of self-explanation prompts in online homework problems in a CS1 course. This provides a first step in exploring the future of how this methodology can be useful in bridging research and practice in doing continuous improvement

    Impact of Guidance and Interaction Strategies for LLM Use on Learner Performance and Perception

    Full text link
    Personalized chatbot-based teaching assistants can be crucial in addressing increasing classroom sizes, especially where direct teacher presence is limited. Large language models (LLMs) offer a promising avenue, with increasing research exploring their educational utility. However, the challenge lies not only in establishing the efficacy of LLMs but also in discerning the nuances of interaction between learners and these models, which impact learners' engagement and results. We conducted a formative study in an undergraduate computer science classroom (N=145) and a controlled experiment on Prolific (N=356) to explore the impact of four pedagogically informed guidance strategies and the interaction between student approaches and LLM responses. Direct LLM answers marginally improved performance, while refining student solutions fostered trust. Our findings suggest a nuanced relationship between the guidance provided and LLM's role in either answering or refining student input. Based on our findings, we provide design recommendations for optimizing learner-LLM interactions

    Spatial Skills and Demographic Factors in CS1

    Get PDF
    Motivation Prior studies have established that training spatial skills may improve outcomes in computing courses. Very few of these studies have, however, explored the impact of spatial skills training on women or examined its relationship with other factors commonly explored in the context of academic performance, such as socioeconomic background and self-efficacy. Objectives In this study, we report on a spatial skills intervention deployed in a computer programming course (CS1) in the first year of a post-secondary program. We explore the relationship between various demographic factors, course performance, and spatial skills ability at both the beginning and end of the term. Methods Data was collected using a combination of demographic surveys, existing self-efficacy and CS1 content instruments, and the Revised PVST:R spatial skills assessment. Spatial skills were evaluated both at the beginning of the term and at the end, after spatial skills training was provided. Results While little evidence was found to link spatial skills to socioeconomic status or self-efficacy, both gender identity and previous experience in computing were found to be correlated to spatial skills ability at the start of the course. Women initially recorded lower spatial skills ability, but after training, the distribution of spatial skills scores for women approached that of men. Discussion These findings suggest that, if offered early enough, spatial skills training may be able to remedy some differences in background that impact performance in computing courses

    ABScribe: Rapid Exploration of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models

    Full text link
    Exploring alternative ideas by rewriting text is integral to the writing process. State-of-the-art large language models (LLMs) can simplify writing variation generation. However, current interfaces pose challenges for simultaneous consideration of multiple variations: creating new versions without overwriting text can be difficult, and pasting them sequentially can clutter documents, increasing workload and disrupting writers' flow. To tackle this, we present ABScribe, an interface that supports rapid, yet visually structured, exploration of writing variations in human-AI co-writing tasks. With ABScribe, users can swiftly produce multiple variations using LLM prompts, which are auto-converted into reusable buttons. Variations are stored adjacently within text segments for rapid in-place comparisons using mouse-over interactions on a context toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001), enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and provides insights into how writers explore variations using LLMs

    Practice report: six studies of spatial skills training in introductory computer science

    Get PDF
    We have been training spatial skills for Computing Science students over several years with positive results, both in terms of the students’ spatial skills and their CS outcomes. The delivery and structure of the training has been modified over time and carried out at several institutions, resulting in variations across each intervention. This article describes six distinct case studies of training deliveries, highlighting the main challenges faced and some important takeaways. Our goal is to provide useful guidance based on our varied experience for any practitioner considering the adoption of spatial skills training for their students

    Computing Maximal Lyndon Substrings of a String

    No full text
    There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal Lyndon substrings of the input string, and secondly, Franek et al. showed in 2017 a linear equivalence of sorting suffixes and sorting right-maximal Lyndon substrings of a string, inspired by a novel suffix sorting algorithm of Baier. In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array that encodes the knowledge of right-maximal Lyndon substrings of the input string. Among those presented were two well-known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on the iterated Duval algorithm for Lyndon factorization and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying to it the next smaller value algorithm. Duval&rsquo;s algorithm works for strings over any ordered alphabet, while for linear suffix sorting, a constant or an integer alphabet is required. The authors at that time were not aware of Baier&rsquo;s algorithm. In 2017, our research group proposed a novel algorithm for the Lyndon array. Though the proposed algorithm is linear in the average case and has O(nlog(n)) worst-case complexity, it is interesting as it emulates the fast Fourier algorithm&rsquo;s recursive approach and introduces &tau;-reduction, which might be of independent interest. In 2018, we presented a linear algorithm to compute the Lyndon array of a string inspired by Phase I of Baier&rsquo;s algorithm for suffix sorting. This paper presents the theoretical analysis of these two algorithms and provides empirical comparisons of both of their C++ implementations with respect to the iterated Duval algorithm

    Computing Maximal Lyndon Substrings of a String

    No full text
    There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal Lyndon substrings of the input string, and secondly, Franek et al. showed in 2017 a linear equivalence of sorting suffixes and sorting right-maximal Lyndon substrings of a string, inspired by a novel suffix sorting algorithm of Baier. In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array that encodes the knowledge of right-maximal Lyndon substrings of the input string. Among those presented were two well-known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on the iterated Duval algorithm for Lyndon factorization and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying to it the next smaller value algorithm. Duval&rsquo;s algorithm works for strings over any ordered alphabet, while for linear suffix sorting, a constant or an integer alphabet is required. The authors at that time were not aware of Baier&rsquo;s algorithm. In 2017, our research group proposed a novel algorithm for the Lyndon array. Though the proposed algorithm is linear in the average case and has O(nlog(n)) worst-case complexity, it is interesting as it emulates the fast Fourier algorithm&rsquo;s recursive approach and introduces &tau;-reduction, which might be of independent interest. In 2018, we presented a linear algorithm to compute the Lyndon array of a string inspired by Phase I of Baier&rsquo;s algorithm for suffix sorting. This paper presents the theoretical analysis of these two algorithms and provides empirical comparisons of both of their C++ implementations with respect to the iterated Duval algorithm

    Building Recommendations for Conducting Equity-Focused, High Quality K-12 Computer Science Education Research

    No full text
    To investigate and identify promising practices in equitable K-12 computer science (CS) education, the capacity for education researchers to conduct this research must be rapidly built globally. Simultaneously, concerns have arisen over the last few years about the quality of research that is being conducted and the lack of equity-focused research. In this working group, we will tackle the research question: In what ways can previous research standards inform high-quality, equity-focused K-12 CS education research? We will use existing research and various standards bodies (e.g., European Educational Research Association, Australian Education Research Organisation, CONSORT, American Psychological Association) to synthesize key features in the context of equity-focused K-12 CS education research. We will then vet these attributes with experts who can provide feedback and refine our recommendations and guidelines. Our working group will select the experts using a strata reflecting a diversity of backgrounds and experiences to support our focus on student populations that have been historically marginalized in computing (e.g., low-income students, rural students, girls, students with disabilities). Our recommendations will directly impact future equitable computing education research by providing guidance on conducting high-quality research such that the findings can be aggregated and impact future policy with evidence-based results. While we recognize that different countries and regions may yield differing answers to this question, our recommendations will be robust enough that researchers in each country or region may choose to use those most appropriate to their context
    corecore